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ABSTRACT 

Mechanical unfolding of the fourth domain of Distyostelium discoideum fil- 
amin (DDFLN4) was studied by all-atom molecular dynamics simulations, using 
the GROMOS96 force field 43al and the simple point charge explicit water sol- 
vent. Our study reveals an important role of non-native interactions in the 
unfolding process. Namely, the existence of a peak centered at the end-to-end 
extension AR ~ 22 nm in the force-extension curve, is associated with breaking 
of non-native hydrogen bonds. Such a peak has been observed in experiments 
but not in Go models, where non-native interactions are neglected. We predict 
that an additional peak occurs at AR ~ 2 nm using not only GROMOS96 force 
field 43al but also Amber 94 and OPLS force fields. This result would stimulate 
further experimental studies on elastic properties of DDFLN4. 
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1. INTRODUCTION 



The last ten years have witnessed an intense activity in single molecule force spectroscopy 
experiments in detecting inter and intramolecular forces of biological systems to understand 
their functions and structures. Much of the research has been focused on elastic properties 
of proteins, DNA, and RNA, i.e, their response tq_an external force, following the seminal 
papers by Rief et al. and Tskhovrebova et al. The main advantage of this technique 
is its ability to separate out fluctuations of individual molecules from the ensemble average 
behavior observed in traditional bulk biochemical experiments. This allows one for studying 
unfolding pathways in detail using the end-to-end distance as a good reaction coordinate. 
Moreover, the single molecule force spectroscopy can be used to decipher the unfolding free 
energy landscape (FEL) of biomolecules 0, 4]. 

As cytoskeletal proteins, large actin-binding proteins play a key role in cell organiza- 
tion, mechanics and signaling [5|. During the permanent cytoskeleton reorganization, all 
involved participants are subject to mechanical stress. One of them is the fourth domain 
of Distyostelium discoideum filamin (DDFLN4), which binds different components of actin- 
binding protein. Therefore, understanding the mechanical response of DDFLN4 protein to 
an external force is of great interest. In recent Atomic Force Microscopy (AFM) experiments 



7|, Schwaiger et al. have shown that DDFLN4 (Fig. 1), which has seven /5-strands in the 
native state (NS), unfolds via intermediates. In the force-extension curve, they observed two 
peaks at the end-to-end extension Ai? ^11 and 22 nm {Material and Methods). However, 
using a Go model 8|], Li et al. ^ have also obtained two peaks but they are located at 
A/2 ^1.5 and 11 nm (see also Material and Methods). A natural question to ask is if the 
disagreement between experiments and theory is due to over-simplification of the Go mod- 
eling, where non-native interactions between residues are omitted. To answer this question, 
we have performed all-atom molecular dynamics simulations, using the GROMOS96 force 



field 43al 



9|] and the simple point charge (SPG) explicit water solvent [l^. In order to 



check robustness of our results, limited all-simulations have been carried out with the help 



of the Amber 94 



111 and OPLS [12| force fields. 



We have shown that, two peaks do appear at almost the same positions as in the exper- 
iments and more importantly, the peak at Ai? ^ 22 nm comes from the non-native 
interactions. This explains why it has not been seen in the previous Go simulations Q]. In 
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FIG. 1: (a) The native conformation of DDFLN4 (PDB code: IKSR) with seven /3-strands (A - G). 
(b) Schematic plot which shows backbone contacts (dotted lines) between /J-strands. All adjacent 
/3-strands are anti-parallel to each other. A, B, E, and D belong to the first /3-sheet, whereas the 
second one contains C, F, and G. (c) The solvated system in the orthorhombic box of water (cyan). 
VMD software H was used for a plot. 

our opinion, this result is very important as it opposes to the common belief [l^ that 
mechanical unfolding properties of proteins are governed by their native topology. In addi- 



tion, to two peaks at large Ai?, in agreement with the Go results 



J], we have also observed 



a maximum at Ai? ^ 2 nm. Because such a peak was not detected by the AFM experiments 



3], further experimental and theoretical studies are required to clarify this point. 



2. MATERIALS AND METHODS 



Simulation details. We used the GROMOS96 force field 43al H to model DDFLN4 which 



has 100 amino acids, and the SPG water model 



101 ] to describe the solvent. The Gromacs 



version 3.3.1 has been employed. The protein was placed in an orthorhombic box with the 
edges of 4.0, 4.5 and 43 nm, and with 25000 - 26000 water molecules (Fig. Wp)- 

In all simulations, the GROMAGS program suite jis], Q was employed. The equations of 



motion were integrated by using a leap-frog algorithm with a time step of 2 fs. The LINGS 
[itI was used to constrain bond lengths with a relative geometric tolerance of 10~^. We used 
the particle-mesh Ewald method to treat the long-range electrostatic interactions 181]. The 
nonbonded interaction pair-list were updated every 10 fs, using a cutoff of 1.2 nm. 

The protein was minimized using the steepest decent method. Subsequently, uncon- 
strained molecular dynamics simulation was performed to equilibrate the solvated system 
for 100 ps at constant pressure (1 atm) and temperature T = 300 K with the help of the 
Berendsen coupling procedure 19|. The system was then equilibrated further at constant 
temperature T = 300 K and constant volume. Afterward, the N-terminal was kept fixed 
and the force was applied to the C-terminal through a virtual cantilever moving at the con- 
stant velocity v along the biggest z-axis of simulation box. We have also performed limited 
simulations for the case when the N-terminal is pulled. 

During the simulations, the spring constant was chosen as k = 1000 kJ/(mol*nm^) ^ 1700 
pN/nm which is an upper limit for A; of a cantilever used in AFM experiments. Movement of 
the pulled termini causes an extension of the protein and the total force can be measured by 
F = k{vt — x), where x is the displacement of the pulled atom from its original position. The 
resulting force is computed for each time step to generate a force extension profile, which 
has peaks showing the most mechanically stable places in a protein. 

Overall, the simulation procedure is similar to the experimental one, except that pulling 
speeds in our simulations are several orders of magnitude higher than those used in experi- 
ments. In the N-terminal fixed case, we have performed simulations for v = 10^, 5 x 10^, 1.2 x 

n 

10^°, and 2.5 x 10^'^ nm/s, while in the AFM experiments one used v ~ 100 — 1000 nm/s |6|]. 
For each value of v we have generated 4 trajectories. In the C-fixed case, the simulations 
were carried out for v = 5 x 10^ nm/s and three trajectories. 

A backbone contact between amino acids i and j {\i — j\ > 3) is defined as formed if the 
distance between two corresponding Ca-atoms is smaller than a cutoff distance dc = 6.5 A. 
With this choice, the molecule has 163 native contacts. A hydrogen bond is formed provided 
the distance between donor D (or atom N) and acceptor A (or atom O) < 3.5A and the 
angle D-H-A > 145°. 

The unfolding process was studied by monitoring the dependence of numbers of backbone 
contacts and hydrogen bonds (HBs) formed by seven /3-strands enumerated as A to G (Fig. 
la) on the end-to-end extension. In the NS, backbone contacts exist between seven pairs 
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of /9-strands Pab, Paf, Pbe, Pcd, Pcf, Pde, and Pfg as shown in Fig. lb. Additional 
information on unfolding pathways was also obtained from the evolution of contacts of these 
pairs. To understand the nature of the third peak in the force-extension curve, we have also 
monitored the evolution of contacts formed by non-native /3-strands which emerge as the 
unfolding progresses. 

Validity of GROMOS96 force field 43al. It should be noted that the Gromos force 



field has 
proteins 



3een proved useful for studying structural dynamics and kinetics of peptides and 



20|. It can not only decipher folding mechanisms but also provide reasonable 



estimates for folding times 211]. While the steered molecular dynamics with NAMD 22] 
is widely used for studying mechanical properties of biomolecules for a decade, the pulling 
option has been recently implemented in the GROMACS software. Therefore, we want to 
check its validity for stretching proteins first. Having used speed v = 5 x 10^ nm/s, we 
can demonstrate (results not shown) that the force-extension profile for the domain 127 is 
similar to the result obtained by NAMD 23||. We have also carried out limited runs for 
domain 5 (DDFLN5) for comparison with DDFLN4. In agreement with the experiments 
, domain 4 is less stable and has one peak more than DDFLN5 (Fig. [2ti). Thus, the 
GROMACS software with the GROMOS96 force field 43al can serve as a reliable tool for 
studying protein mechanical unfolding. To ensure that this conclusion is correct, we have 



Ul and OPLS 
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also performed a limited set of simulations using all-atom Amber 94 
force fields. The simulations were carried out within the GROMACS program suite 15l. Il6|. 
A short survey on previous results obtained by experiments and Go simula- 
tions. In order to make the presentation transparent, let us briefly discuss the previous 
experimental and simulation results [2J]. Experimentally one observed two peaks in 



the force-extension curve . However, determining their precise location as a function 
of the end-to-end distance is not an easy task, because, on experiments, unfolding lengths 
were measured by fitting data to the worm-like chain model but not by direct peak spac- 
ing. Schwaiger et al. jo] reported that the first peak occurs due to unfolding of strands A 
and B and the loop between B and C. In this first unfolding event, the length changes by 
ALi = 14 — 15 nm, where L is the length parameter in the worm-like chain model. Assuming 
the distance between to neighboring amino acides a = 0.36 nm, they have shown that this 
length gain corresponds to full stretching of ~ 40 first residues. Since L and R have dif- 
ferent physical meanings, this does not mean that the first peak is located at the extension 
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FIG. 2: (a) The force-extension profiles for Domain 4 (black) and Domain 5 (red). The results 
were obtained by the all-atom simulations for the pulling speed v = 5 x 10^ nm/s. (b) The force- 
extension curve for DDFLN4 and pulling speed v = 5.8 x 10^ nm/s was obtained by using the 



Go model 



. Results were averaged over 40 trajectories. Arrows refer to positions of twopeaks 



(AR = 11 and 22 nm) which are expected to be the same as observed on the experiments a, lZ| . 
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AR = ALi. Here, we propose to estimate its location by comparing the experimental data 



with our simulation results 



24( 1 ■ Note that such a trick has been already used to locate the 



position of the hump in the force-extension curve for the titin domain 127 23|, |25|. Using the 



Go model 8| and the pulling speed v = 5.8 x 10^ nm/s, we obtain the the force-extension 
curve shown in Fig. [2)3. The first peak at AR ^1.5 nm in this theoretical curve cannot 
correspond to the first peak observed in the experiments for two reasons. First, this value 
of AR is too small to describe unfolding of 40 first residues. Second, the first experimental 
peak occurs due to unfolding of strands A and B 6j, while the theoretical peak corresponds 
to breaking of native contacts between strands A and F 2^ . The second theoretical peak at 
AR = 11± 1 nm (the error bar comes from averaging over different trajectories), could be 
identified as the first experimental peak because it also corresponds to unfolding of strands 
A and B 2J]. Thus, assuming that the Go model correctly captures the first experimental 



peak, one can infer that this peak is located at AR = 11± 1 nm. From the experimental 
data , we can estimate the distance between two peaks ~ 11 ± 1 nm. Therefore the 
second experimental peak is expected to locate at AR = 22 ± 2 nm. This peak was not 
observed in our previous Go simulations (Fig. [2b) 2^- But we will show that the all-atom 
models, where the non-native interactions are taken into account, can reproduce it. 



3. RESULTS 



3.1. Existence of three peaks in force-extension profile 

Since the results obtained for four pulling speeds {Material and Methods) are qualitatively 
similar, we will focus on the smallest v = 10^ nm/s case. The force extension curve, obtained 
at this speed, for the trajectory 1, can be divided into four regions (Fig. [3]): 

Region I (0 ^ AR ^ 2.4 nm). Due to thermal fluctuations, the total force fluctuates a 
lot, but, in general, it increases and reaches the first maximum fmaxi = 695 pN at AR ^ 
2.42 nm. A typical snapshot before the first unfolding event (Fig. [3]) shows that structures 
remain native-like. During the first period, the N-terminal part is being extended, but the 
protein maintains all /3-sheet secondary structures (Fig. [4b). Although, the unfolding starts 
from the N-terminal (Fig. Wp), after the first peak, strand G from the C-termini got unfolded 
first (Fig. [It and[4f). In order to understand the nature of this peak on the molecular level, 
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FIG. 3: Force-extension profile for trajectory 1 for v = 10^ nm/s. Vertical dashed lines separate 
four unfolding regimes. Shown are typical snapshots around three peaks. Heights of peaks (from 
left) are fmaxi = 695 pN, fmax2 = 704 pN, and fm.ax3 = 626 pN. 

we consider the evolution of HBs in detail. As a molecule departs from the NS, non-native 
hydrogen bonds are created and at AR = 2.1 nm, e.g., a non-native /3-strand between amino 
acids 87 and 92 (Fig. IDd) is formed. This leads to increase of the number of HBs between 
F and G from 4 (Fig. HJi) to 9 (Fig. H^). Structures with the enhanced number of HBs 
should show strong resistance to the external perturbation and the first peak occurs due to 



their unfolding 
Go simulations 



Fig. Hb). It should be noted that this maximum was also observed in the 
241 ] ■ but not in the experiments 6|, l7|. Both all-atom and Go simulations 



reveal that the unfolding of strand G is responsible for its occurrence. 

Region II (2Anm ^ Ai? ^ 13.4 nm): After the first peak, the force drops rapidly from 
695 to 300 pN and secondary structure elements begin to break down. During this period, 
strands A, F and G unfold completely, whereas B, C, D and E strands remain structured 
(see Fig. [3] for a typical snapshot). 
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FIG. 4: (a) The NS conformation is shown for comparison with the other ones, (b) A typical 
conformation before the first unfolding event takes place {AR ~ 2.1 nm). The yellow arrow shows 
a part of protein which starts to unfold. An additional non-native /3-strand between amino acids 
87 and 92 is marked by black color, (c) A conformation after the first peak, at AR ~ 2.8 nm, 
where strand G has already detached from the core, (d) The same as in (a) but 4 HBs (green 
color) between /5-strands are displayed, (e) The same as in (b) but all 9 HBs are shown, (f) The 
same as in (c) but broken HBs (purple) between F and G are displayed. 

Region III (13 Anm ^ AR < 22.1 nm): During the second and third stages, the complete 
unfolding of strands D and E takes place. Strands B and C undergo significant conforma- 
tional changes, losing their equilibrium hydrogen bonds. Even though a core formed by 
them remains compact (see bottom of Fig. [3] for a typical snapshot). Below we will show in 
detail that the third peak is associated with breaking of non-native hydrogen bonds between 
strands B and C. 

Region IV (AR ^ 22.1 nm:) After breaking of non-native HBs between B and C, the 
polypeptide chain gradually reaches its rod state. 

For the pulling speed v = 10^ nm/s, the existence of three peaks is robust as they are also 
observed in all three remaining trajectories (Fig. [5] and [6^). It is also clearly evident from 
Fig. [6h>, where the force-extension curve, averaged over 4 trajectories, is displayed. The 
positions of peaks fluctuate from trajectory to trajectory but within error bars the locations 
of first two peaks are in the reasonable agreement with the Go simulations 2^. The distance 
between the second and third peaks is about 12 nm (Fig. [6)d) and this is in accord with the 
experiments 

aa. 
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FIG. 5: The same as in Fig. [3] but for trajectory 2 (a) and 3 (b). Heights of peaks (from left) 
are fmaxi = 740 pN, fmax2 = 614 pN, and fmax3 = 563 pN for trajectory 2 and fmaxi = 685 pN, 
fmax2 = 773 pN, and fmaxs = 500 pN for trajectory 3. 
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FIG. 6: (a) The same as in Fig. [3] but for trajectory 4. Heights of peaks (from left) are fmaxi = 906 
pN, fmax2 = 652 pN, and fmax3 = 630 pN. (b) The averaged over 4 trajectories force-extension 
profile at V = 10^ nm/s. The corresponding peaks are fmaxi = 597 pN, fmax2 = 530 pN, and 
fmaxs = 398 pN. The distance between the second and third peaks is about 12 nm. 
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3.2. Robustness of three peaks against pulling speeds 



The question we now ask is if the existence of three peaks is independent of pulhng 
speeds. To this end, we performed simulations at various loading speeds and the results are 
shown in Fig. [7^. In accordance with the kinetic theory 26|], the heights of maxima decrease 
as V is lowered, but three peaks exist for all values of pulling speeds. Since the first peak 
was not observed in the experiments , a natural question emerges is whether it is an 
artifact of high pulling speeds used in our simulations. Except data at the highest value of 
V (Fig. Uh), within error bars three maxima are compatible. Therefore, in agreement with 
the coarse-grained simulation results 2J], the peak centered at AR ^ 2 nm is expected to 
remain at experimental loading rates. It remains unclear if this is a shortcoming of theory or 
of experiments because it is hard to imagine that a /^-protein like DDFLN4 displays the first 
peak at such a large extension AR ^11 nm. The force-extension curve of the titin domain 
127, which has a similar native topology, for example, displays the first peak at AR ^ 0.8 
nm 251 . From this prospect, our theoretical result for DDFLN4 is more favorable. One of 
possible reasons of why the experiments did not detect this maximum is related to a strong 
linker effect as a single DDFLN4 domain is sandwiched between Ig domains 127-30 and 
domains 131-34 from titin 61. Remember that in the case of the domain 127, the hump in 

n 

the force-extension curve was not observed in the first experiments [11 . It was experimentally 
only after the theoretical prediction for its existence j^]. It would be very 



detected 



interesting if the first peak predicted by our simulations will be confirmed by experiments. 



3.3. Importance of non-native interactions 

As mentioned above, the third peak at AR ^ 22 nm was observed in the experiments 



but not in Go models [4, l2j], where non-native interactions are omitted. In this section, 
we show, at molecular level, that these very interactions lead to its existence. To this end, 
for the first trajectory of the lowest pulling speed {v = 10^ nm/s), we plot the number of 
native contacts formed by seven strands and their pairs as a function of AR. The first peak 
corresponds to unfolding of strand G (Fig. [8^) as all (A,F) and (F,G) contacts are broken 
just after passing it (Fig. [8b). Thus, the structure of the first intermediate state, which 
corresponds to this peak, consists of 6 ordered strands A-F (a typical snapshot is given in 
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FIG. 7: (a) Force-extension profiles for three values of v shown next to the curves, (b) Dependence 
of heights of three peaks on v. Results are averaged over four trajectories for each value of v. 

Fig. St). 

The second unfolding event is associated with full unfolding of A and F and the drastic 
decrease of native contacts of B and C (Fig. [8^1). After the second peak only (B,E), (C,D) and 
(D,E) native contacts survive dHb). The structure of the corresponding second intermediate 
state contains partially structured strands B, C, D and E. A typical snapshot is displayed 
in top of Fig. [3l 
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FIG. 8: (a) Dependence of the number of native backbone contacts formed by individual strands 
on /S.R for v = 10^ nm/s. Arrows refer to positions of three peaks in the force-extension curve, (b) 
The same as in (a) but for pairs of strands, (c) The same as in (a) but for all contacts (native and 
non-native), (d) The same as in (c) but for HBs. 

Remarkably, for Ai? ^18 nm, none of native contacts exists, except very small fluctu- 
ations of a few contacts of strand B around Ai? ^ 22.5 nm (Fig. [8^). Such fluctuations 
are negligible as they are not even manifested in existence of native contacts between corre- 
sponding pairs (A,B) and (B,E) (Fig. [Hh") Therefore, we come to a very interesting conclusion 
that the third peak centered at Ai? 22.5 nm is not related to native interactions. This 

fin n 

explains why it was not detected by simulations [J, |2J] using the Go model [8[]. 

The mechanism underlying occurrence of the third peak may be revealed using the results 
shown in Fig. [St;, where the number of all backbone contacts (native and non-native) is 
plotted as a function of Ai?. Since, for Ai? ^ 18 nm, native contacts vanish, this peak 
is associated with an abrupt decrease of non-native contacts between strands B and C. 
Its nature may be also understood by monitoring the dependence of HBs on Ai? (Fig. [Hll), 
which shows that the last maximum is caused by loss of HBs of these strands. More precisely, 
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FIG. 9: (a) Dependence of the number of HBs between pairs of strands for the first trajectory and 
V = 10^ nm/s. Non-native HBs between strands B and C start to appear at AR ~ 15 nm (red 
curve). Their rupture leads to the maximum centered at AR ~ 22.4 nm. Upper snapshot shows 
five HBs between B an C (green) before the third unfolding event. Lower snapshot is a fragment 
after the third peak, where all HBs are already broken (purple dotted lines), (b) The same as in 
(a), but for trajectory 2. The upper snapshot shows the HBs between strands C and E before 
their rupture. Just after the last unfolding event, one HB between D and E also survives (lower 
snapshot). 

five HBs between B and C, which were not present in the native conformation, are broken 
(Fig. [9^). Interestingly, these bonds appear at AR ^ 15 nm, i.e. after the second unfolding 
event (Fig. [9k). Thus, our study can not only reproduce the experimentally observed peak at 
AR ^ 22 nm, but also shed light on its nature on the molecular level. From this perspective. 
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all-atom simulations are superior to experiments. 

One corollary from Fig. [8] is that one can not provide a complete description of the 
unfolding process based on the evolution of only native contacts. It is because, as a molecule 
extends, its secondary structures change and new non-native secondary structures may occur. 
Beyond the extension of 17-18 nm (see snapshot at bottom of Fig. [3]), e.g., the protein lost 
all native contacts, but it does not get a extended state without any structures. Therefore, a 
full description of mechanical unfolding may be obtained by monitoring either all backbone 
contacts or HBs, as these two quantities give the same unfolding picture (Fig. ^ and[8ll). 

As said above, for trajectory 1 the third peak occurs due to breaking non-native HBs 
between strands B and C, but this mechanism is not universal for all trajectories we have 
studied. In the case of second trajectory, it is associated with breaking of the non-native 
HBs between strands C and E (Fig. [9h>), which have been created during the stage HI (Fig. 
[5^). After the rupture of the non-native HBs, a partial recovery of the D and E strands at 
AR ~ 24 nm has been observed. However, the further unfolding of refolded pieces of these 
strands does not affect the force-extension profile significantly and its influence is much less 
pronounced compared non-native HBs between C and E (Fig. [9b). 

For the first two trajectories, it was enough to consider the evolution of native and non- 
native HBs between the strands formed in the native state. However, for the third and fourth 
trajectories, determining the nature of the third peak requires consideration of additional 
/^-strands that appear during the unfolding process. In the case of trajectory 3, the native 
strand D has been extended for 3 amino acids more (from amino acid 54 to 59) and this 
extension is labeled as D^. In the upper snapshot of Fig. [TOk . this non-native part is shown 
in black. The red curve in this figure clearly shows that the third peak appears due to the 
breaking of four non-native HBs between the C and strands. After this unfolding event 
HBs between all pairs are ruptured. 

For trajectory 4, HBs between all native strands were broken before the protein reaches 
AR ^ 15 nm (Fig. [TOb). The nature of the third peak is purely related to the rupture of 
non-native HBs between strand B and the non-native strand (from amino acid 17 to 21) 
which is described by the black arrow (upper snapshot of Fig. [TOb). Thus , the non-native 
interactions are responsible for occurrence of the last peak, but individual pathways show a 
rich diversity. 
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FIG. 10: (a) The same as in Fig. [9l but for trajectory 3. The upper snapshot shows HBs between 
C and the non-native strand D"*" (black) which includes amino acids 54-59. The rupture of these 
bonds (lower snapshot) leads to the occurrence of the third peak, (b) The same as in (a), but for 
trajectory 4. Here the nature of the last peak is related to breaking of HBs between B and the 
non-native strand B^" (black, amino acids 17-21) as shown in the upper snapshot. 

3.4. Unfolding pathways: N-fixed case 

To obtain sequencing of unfolding events at v = 10^ nm/s, we use dependencies of the 
number of HBs on AR. From Fig. [Hli and Fig. [11], we have the following unfolding pathways 
for four trajectories: 
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(1) 

Although these pathways are different, they share a common feature that the C-terminal 
unfolds first. This is consistent with the results obtained by Go simulations at high pulling 
speeds f ~ 10^ nm/s 24 1, but contradicts to the experiments j^, Q], which showed that 
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FIG. 11: The end-to-end extension dependence of the number of HBs, formed by seven strands, 
for trajectory 2, 3 and A. v = 10^ nm/s. 



strands A and B from the N-termini unfold first. On the other hand, our Go simulations [24 1 
have revealed that the agreement with the experimental results is achieved if one performs 
simulations at relatively low pulling speeds v ~ 10^ nm/s. Therefore, one can expect that 
the difference in sequencing of unfolding events between present all-atom results and the 
experimental ones is merely due to large values of v we used. In order to check this, one has 
to carry out all-atom simulations, at least, at f ~ 10^ nm/s, but such a task is far beyond 
nowaday computational facilities. 

Schwaiger et al. suggested that conformations of the single unfolding intermediate 

which corresponds to the first peak in the force-extension curve contain five strands C-G 
with A and B fully unstructured. As follows from our all-atom simulations, there exist two 
intermediates related to peaks located at AR ~ 2 and 12 nm. Since at AR ~ 2 nm, only G 
unfolds (Fig. Ht, [Hli and fTTl) . the structure of the first intermediate consists of six structured 
strands A-F which is more ordered than the experimental one. The second intermediate, 
located at AR ~ 12 nm, is less ordered having four strands B-E structured (Fig. El [H [6], [HH 
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and [TTI) . Again, a huge difference in pulling speeds is a reason for differences in the nature 
of experimentally and theoretically predicted intermediates. 

It is very important to note that unfolding pathways depend on pulling speeds but the 
number of peaks in the force-extension curve remains unchanged. If non-native interactions 
are neglected as in the Go model then one has two peaks 2J], but the inclusion of these 
interactions leads to occurrence of the third maximum at Ai? ~ 22 nm (Fig. [7]). Thus, the 
main meaning of the present work is not in describing unfolding pathways but in capturing 
this peak. 



3.5. Unfolding pathways: C-fixed case 

It is known that unfolding pathways may depend on what end of a protein is kept fixed. 
For ubiquitin, pulling at the C-terminal gives pathways different from those obtained in the 



case when the N-end is pulled [28[. However, unfolding pathways of the domain 127 do not 
depend on what terminal is fixed j28|. To study the effect of terminal fixation on unfolding 
pathways of DDFLN4, we have performed limited simulations a.t v = 5 x 10^ nm/s for three 
trajectories. As in the N- fixed case, the force-extension curves display three peaks (Fig. [T2l) . 
Thus, the number of peaks does not depend on what terminal is immobilized. 

Fig. [T^ shows the dependence of HBs on AR for trajectory 1. Before the last unfolding 
event, there exist only few non-native HBs between strands C and E (upper snapshot). The 
lower snapshot shows that after the third peak these bonds got ruptured. One can show that 
the third peak in the second and third trajectories also comes from the rupture of non-native 
HBs (results not shown). Thus, the nature of the third peak remains essentially the same 
as in the N-fixed case. 

From the dependencies of the total number of backbone contacts formed by secondary 
structures (Fig. [Ml) , we obtain the following unfolding pathways: 

G ^ F ^ B ^ {C,E) ^ D, Trajectory 1, 
G ^ B ^ F ^ {C,D,E), Trajectory 2, 
A^ B ^ G ^ {G,F) ^ {E,D), Trajectory 3. (2) 



In three cases, the strand A is unfolds first (Eq. ([2])). For the third trajectory A anc 
unfold first and this scenario is in accord with the experiments of Schwaiger et al. j^, 7| 
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C-fix, v = 5*10^ nm/s 
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FIG. 12: The force-extension profiles, obtained in three molecular dynamics runs with the C- 
terminal kept fixed, v = 5 x 10^ nm/s. Arrows refer to positions of maxima. 

However, such an agreement is fortuitous as it happens just in one case. As evident from 
Eqs . ([T]) and ([2]), unfolding pathways depend on what terminal is pulled. Intuitively, it 
is clear because at high pulling speeds a terminal, which the external force is applied to, 
should unfold first as it unfolds before the force propagates to the opposite end. 

One can infer that the first intermediate, which corresponds to the first peak, consists of 
conformations with only strand A unfolded ( Figs. [T2] and [T4l) . This is valid for all three 
trajectories. Structures of the second intermediate, related to the second unfolding event, 
show more diversity. For the first trajectory, it consists of 4 structured strands B-E (Fig. 
[H^). After the second peak. A, F and G become unstructured. The second and third 
trajectories have the same second intermediate which contains four strands C-F (Fig. [T^ 
and Overall, as in the N-fixed case, due to high pulling speeds, the simulations give 

different unfolding pathways compared to the experimental ones. Nevertheless, the third 
peak, which is absent in the Go modeling, occurs in both N-fixed and C-fixed all-atom 
simulations. 
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FIG. 13: Dependence of the number of HBs between pairs of strands for the first trajectory. The 
pulhng speed v = 5 x 10^ nm/s and the C-end is hold fixed. The upper snapshot shows non-native 
HBs between strand C and E before their rupture. After passing the third peak they are broken 
(lower snapshot). 
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FIG. 14: Dependence of the total number of all backbone contacts (native and non-native), formed 
by seven strands, on AR. f = 5 x 10^ nm/s and the C-end is immobile. 

The question we now ask is whether the dependence of unfolding pathways on the terminal 
fixation is intrinsic for DDFLN4 or this is an artifact of high loading rates. Since it is 
impossible to carry out all-atom simulations at small v, we have performed the simulations 
at ?; ~ 10^ nm/s using the Go model 8| with the C-end hold fixed. The results, obtained 
for the A^-fixed case, were reported previously 2^- It turns out that at low pulling speeds 
unfolding pathways are the same as in the the A^- fixed case (results not shown). Therefore, 



we speculate that, similar to the domain 127 28|, at low loading rates unfolding pathways 



of DDFLN4 do not depend on the way one pulls it. Our speculation is not unreasonable 
because for small v, it does not matter what end is pulled as the external force is uniformly 
felt along a chain . Then, a strand, which has a weaker link with the core, would unfold 
first. 
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3.6. Unfolding intermediates and pathways at low pulling speeds 



Because all-atom simulations at low v are prohibited, in this section, we try to infer 
information on unfolding intermediates and pathways using the energetic argument and 



results followed from the Go model [2j]. As shown in our previous work [2j], the peak, 
centered at AR ^ 2 nm, results from breaking of contacts between A and F. Moreover, 
within the Go modeling, contacts between these strands break first not only at low but also 
at high loading rates. Our high-z; all-atom results (Fig. [Hb and similar results for other cases 
are not shown) also confirm this. In order to see if (A,F) native contacts break first at low 
loading rates, we calculated the interaction energies between terminal pairs (A,B), (A,F), 
and (F,G) using the Gromos force field 43al and equilibrium conformations. The latter 
are those conformations which have been obtained after the equilibration step and used 
as initial conformations for pulling. At the room temperature, they are very close to the 
native conformation (RMSD ^ 0.7 A). Since the interaction energies Eab ~ -120.4, Eaf ~ 
-12.2, and Epc ~ -114.4 kcal/mol, one can expect that (A,F) contacts always break first. 
Therefore, in the first intermediate state, all strands remain structured but the contacts 
between A and F are lost. As mentioned above, the absence of the first peak in experiments 
may be due to the strong linker effect. Another possible reason is that one needs a relatively 
small amount of energy to break (A,F) couplings. This subtle effect may be overlooked in 
the experiments. 



At low V, the Go simulations [2J] show that A and B unfold first and this pathway agrees 
very well with the experiments of Schwaiger et al. 6|, [7|. Consequently, the corresponding 
theoretical intermediate, which consists of 5 strands C-G, is also identical to the experimental 
one. Since the second peak at Ai? ^11 nm is caused by native interactions, its nature would 
be the same as in the Go model. Thus, we expect that at low pulling speeds, the second 
intermediate provided by the Gromos force field coincides with the experimental one. 

3.7. Robustness of the results against other force fields 

To make sure that the results obtained by the GROMOS96 force field 43al are robust, 
we have also performed a limited set of simulations for w = 5 x 10^ nm/s using the Amber 94 



III and OPLS [12| force fields. Fig. [TSl shows two typical force-extension curves obtained by 
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v=5*10^ nm/s 




eiid-to-end extention (iim) 

FIG. 15: Shown is the force-extension profile obtained by using the Amber 94 (upper panel) and 
OPLS (lower panel) force fields. The N-end is kept fixed and the pulling speed v = 5 x 10^ nm/s. 

these force fields (other similar curves are not shown) for the pulling speed v = 5x10^ nm/s 
and the N-terminus is hold fixed. In accordance with the GROMOS96 force field 43al case, 
the Amber 94 and OPLS force fields provide three peaks and the nature of the third peak 
is also associated with non-native interactions (results not shown). Thus, the existsence of 
three peaks and their nature do not depend on the force fields we used and one can expect 
that this conclusion remains valid for other existing force fields. 
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CONCLUSIONS 



We used the GROMOS96 force field 43al with explicit water to understand the mechan- 
ical unfolding of the protein domain DDFLN4. The validity of this approach was carefully 
checked not only by considering the well-studied titin domain 127 and the domain DDFLN5 
which is mechanically more stable than DDFLN4, but also by comparison with results ob- 
tained by Amber 94 and OPLS force fields. We have reproduced the experimental result 
on existence of two peaks located at AR ^11 and 22 nm. Our key result is that the later 
maximum occurs due to breaking of non-native HBs. It can not be encountered by the Go 
models in which non-native interactions are neglected [4, 24|. Thus, our result points to the 
importance of these interactions for the mechanical unfolding of DDFLN4. For the first time 
we have shown that the description of elastic properties of proteins may be not complete 
ignoring non-native interactions. This finding is valuable as the unfolding by an external 
force is widely believed to be solely governed by native topology of proteins. 



Our all-atom simulation study supports the result obtained by the Go model % 124 1 
that an additional peak occurs at AR ^ 2 nm. However, it was not observed by the AFM 
experiments of Schwaiger et al In order to solve this controversy, one has to carry 

out not only additional experiments, but also all-atom simulations at lower pulling speeds 
which are beyond nowaday computational facilities. 
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